FREE online courses on Information Technology - Chapter 7 IT – DATABASE
MANAGEMENT
Database Management
A typical organization has a large number of files, many of
which may be stored on a computer device. We call these data machine readable
because you can use a computer to process them. Paper files, on the other hand,
are much less accessible. Companies frequently refer to related files as a part
of database. This term may be used generically, or it may refer to a specific
system or database management software. Examples of database systems for
personal computers include Dbase, Paradox, and Access. Data bases for midrange
systems include Oracle and Sybase, and IBM's mainframe database is called DB2.
Computers store data in a file, which can be defined simply
as a collection of data. A computer file is organized in a particular way with a
well-defined structure to the information in the file. A computer file consists
of a collection of records, each of which is made up of fields. The various
fields consist of groups of characters, as described below.
The smallest unit of
storage of interest is the character, for example, the number 9 or the letter A.
We generally do not work directly with characters but rather with groups of
characters that have some intrinsic meaning, such as Smith or 599. These
groupings of characters are called fields, and we identify them with a name.
Smith is an employee's surname and 599 is Smith's department number.
Group of fields are combined to form a logical record, such
as the one shown in figure below. This logical record contains all the data of
interest about some entity. In this example. It has all the data in the field
about an individual employee. A key to a record is a specific field of interest.
Many files are organized on a key: Last name is the primary key for a telephone
book. That is, the telephone book is arranged in the alphabetical order based on
telephone subscribers' last names. A secondary key, in the case of the telephone
book, is the person's first name or initial. The telephone book, then, is
arranged in sequence on the primary key (last name), and within the primary key
it is arranged in order by the secondary key (first name). Fields designated as
keys are also used as a basis for retrieving information from a file. For
example, an inventory part number may be the key for retrieving information from
a file. For example, an inventory part number may be the key for retrieving
information from an inventory file about the quantity of the part on hand.
Example:
Smith, D.J.
|
599
|
031875
|
250
|
C
|
G
|
Field
Name
Department Birthday
Salary Occupation Last job
Code Code
Figure: A logical
fixed-length record.
There are two major types of files: sequential and direct
access. Sequential files were the first type of secondary storage. All records
are kept in some sequence such as in order by Social Secondary number. Most of
us will encounter sequential access files only in special circumstances. Records
in this type of file are located one after another according to a given
sequence, for example, the record with payroll number 1 is followed by the
record with number 2, etc. With a payroll number 1 is followed by the record
with number 2, etc. With a sequential file, you cannot find a specific record,
such as the person with payroll number 127, unless you read the entire file
until you locate a record with payroll number 127. On the average, if there are
n records in the file, you will read n/2 records to find the one you are
seeking.
A direct access files uses a physical medium and programming
which facilitate the storage and retrieval of specific records. These files are
at the heart of database management systems and of most of today's file storage
technology.
In a sequential file, finding the data you want is not too
difficult, though it may be time-consuming. Each record is in a sequence, so you
simply read the file until you get to the location of the record of interest.
(This is the reason that sequential files are associated with batch processing.
You update the file at one point in time and make all of the changes, reading
the file just once one and creating a new version.)
The major advantage of the direct access file is what its
name implies. You can locate any record in the file in roughly the same, short
(milliseconds) period of time. For
example, when you call the airline, they want to access the inventory of seats
for the flight you want to take on the date you want to fly without having you
or the agent wait on the phone.
If we request a record number, the file management software
will supply it for us, then we must associate the logical record number with the
information desired. For example, in an inventory application, how do we know
where information on inventory part number 1432 is located? What first record on
the file and read each record until we find part 1432, but this is simply
scanning the file sequentially. One solution to our problem is to create an
index like the index to a book. The computer looks up the logical address for
part 1432 is the index, and then retrieves that record from the disk.
Key
Index entry
Record address
1432
1432-312
312
4293
4293-137
137
We search the index in primary memory (which is several orders of magnitude
faster than searching the disk itself), looking for the key. The index entry
tells at what record that key is located.
Creating complex files using the techniques described above
and many others is a tedious and error-prone process. In the 1960s, software
vendors developed products called
database management systems (DBMSs). These examples of systems software
automate many of the tasks associated with using direct access files. As with
other types of software originally developed for large computers, today there
exist a large number of sophisticated DBMSs for personal computers.
A DBMS has to provide:
¨
A method for defining the contents of the
database.
¨
A way to describe relationships among data
elements and records
¨
A mechanism to set up the database in the first
place.
Ways to
manipulate the data including:
¨
Updating (adding, modifying, and/or deleting
information).
¨
Retrieval using complex criteria to select data.